JALAD: Joint Accuracy- and Latency-Aware Deep Structure Decoupling for Edge-Cloud Execution
Recent years have witnessed rapid growth in deep-network-based services and
applications. A practical and critical problem has thus emerged: how to
deploy deep neural network models so that they can be executed efficiently.
Conventional cloud-based approaches usually run the deep models on data
center servers, incurring large latency because a significant amount of data
has to be transferred from the network edge to the data center. In this
paper, we propose JALAD, a joint accuracy- and latency-aware execution
framework that decouples a deep neural network so that one part runs on edge
devices and the other part in the conventional cloud, with only a minimal
amount of data transferred between them. Though the idea seems
straightforward, we face several challenges: i) how to find the best
partition of a deep structure; ii) how to deploy a component on an edge
device that has only limited computation power; and iii) how to minimize the
overall execution latency. Our answers to these questions are a set of
strategies in JALAD: 1) a normalization-based in-layer data compression
strategy that jointly considers compression rate and model accuracy; 2) a
latency-aware deep decoupling strategy that minimizes the overall execution
latency; and 3) an edge-cloud structure adaptation strategy that dynamically
changes the decoupling under different network conditions. Experiments
demonstrate that our solution significantly reduces execution latency: it
speeds up overall inference while guaranteeing a bound on model accuracy loss.
Comment: conference; copyright transferred to IEEE
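As a minimal sketch of the decoupling idea, the split point between edge and cloud can be chosen by enumerating candidate layers and summing edge compute, transfer, and cloud compute latency. This is an illustrative simplification, not the paper's actual algorithm; all function names and timing numbers below are hypothetical.

```python
def best_split(edge_ms, cloud_ms, act_bytes, bandwidth_bps):
    """Return the split index k (and its latency in ms) that minimizes
    total latency when layers 0..k-1 run on the edge and layers k..n-1
    in the cloud; act_bytes[k] is the size of the (possibly compressed)
    intermediate data transferred at split point k."""
    n = len(edge_ms)
    best_k, best_t = 0, float("inf")
    for k in range(n + 1):
        t = (sum(edge_ms[:k])                          # edge compute
             + act_bytes[k] * 8 / bandwidth_bps * 1000  # transfer (ms)
             + sum(cloud_ms[k:]))                      # cloud compute
        if t < best_t:
            best_k, best_t = k, t
    return best_k, best_t
```

Under a slow uplink the best split moves toward layers with small activations; JALAD's in-layer compression shrinks `act_bytes`, which in turn shifts the optimal split point.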
AdaCompress: Adaptive Compression for Online Computer Vision Services
With the growth of computer-vision-based applications and services, an
explosive number of images are uploaded to cloud servers that host such
computer vision algorithms, usually in the form of deep learning models. JPEG
has been the {\em de facto} compression and encapsulation method applied
before uploading images, due to its wide adoption. However, the standard JPEG
configuration does not always perform well for compressing images that are to
be processed by a deep learning model; e.g., the standard JPEG quality level
incurs 50\% size overhead (compared with the best quality-level selection) on
ImageNet at the same inference accuracy for popular computer vision models
including InceptionNet, ResNet, etc. Even so, designing a better JPEG
configuration for online computer vision services remains extremely
challenging: 1) cloud-based computer vision models are usually a black box to
end-users, so it is difficult to design a JPEG configuration without knowing
their model structures; and 2) the JPEG configuration has to change as
different users use the service. In this paper, we propose a
reinforcement-learning-based JPEG configuration framework. In particular, we
design an agent that adaptively chooses the compression level according to
the input image's features and the backend deep learning model. We then train
the agent with reinforcement learning to adapt it to different deep learning
cloud services, which act as the {\em interactive training environment} and
feed back a reward that jointly considers accuracy and data size. In our
real-world evaluation on Amazon Rekognition, Face++, and Baidu Vision, our
approach reduces image sizes by 1/2 -- 1/3 while the overall classification
accuracy decreases only slightly.
Comment: ACM Multimedia
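The agent-environment loop can be sketched as a toy epsilon-greedy bandit over JPEG quality levels, with a reward that trades accuracy against upload size. This is a deliberate simplification of the paper's approach (which conditions a deep RL agent on image features); the class name, quality levels, and reward weighting below are all hypothetical.

```python
import random

class QualityAgent:
    """Toy epsilon-greedy agent that picks a JPEG quality level and
    learns from a reward combining accuracy and size (sketch only)."""

    def __init__(self, qualities=(95, 75, 50, 30), eps=0.1):
        self.qualities = qualities
        self.eps = eps
        self.value = {q: 0.0 for q in qualities}   # running reward estimate
        self.count = {q: 0 for q in qualities}

    def choose(self):
        # Explore with probability eps, otherwise act greedily.
        if random.random() < self.eps:
            return random.choice(self.qualities)
        return max(self.qualities, key=lambda q: self.value[q])

    def update(self, q, accuracy, size_ratio, alpha=1.0):
        # Reward: high accuracy is good, large uploads are penalized.
        reward = accuracy - alpha * size_ratio
        self.count[q] += 1
        self.value[q] += (reward - self.value[q]) / self.count[q]
```

Treating the cloud vision API as the environment, each round compresses an image at the chosen quality, queries the service, and updates the agent with the observed accuracy and size ratio.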
Interpretable and Efficient Beamforming-Based Deep Learning for Single Snapshot DOA Estimation
We introduce an interpretable deep learning approach for direction-of-arrival
(DOA) estimation with a single snapshot. Classical subspace-based methods
such as MUSIC and ESPRIT use spatial smoothing on uniform linear arrays for
single-snapshot DOA estimation but suffer from reduced array aperture and
inapplicability to sparse arrays. Single-snapshot methods such as compressive
sensing and the iterative adaptive approach (IAA) face high computational
costs and slow convergence, hampering real-time use. Recent deep learning DOA
methods offer promising accuracy and speed, but the practical deployment of
deep networks is hindered by their black-box nature. To address this, we
propose a deep-MPDR network that translates a minimum power distortionless
response (MPDR)-type beamformer into deep learning, enhancing generalization
and efficiency. Comprehensive experiments on both simulated and real-world
datasets demonstrate its advantages in inference time and accuracy over
conventional methods. Moreover, it excels in efficiency, generalizability,
and interpretability when compared with other deep learning DOA estimation
networks.
Comment: 10 pages, 10 figures
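For reference, the classical MPDR-type spectrum that the network builds on can be computed directly from one snapshot. The sketch below (hypothetical; the paper learns this pipeline end-to-end) uses diagonal loading to make the rank-one single-snapshot covariance invertible, and applies the Sherman-Morrison identity so no matrix inverse is needed.

```python
import cmath
import math

def mpdr_spectrum_single_snapshot(x, d=0.5, loading=0.1, n_grid=181):
    """MPDR spatial spectrum P(theta) = 1 / (a^H R^{-1} a) on a ULA,
    with R = loading*I + x x^H built from a single snapshot x.

    By Sherman-Morrison:
    a^H R^{-1} a = (|a|^2 - |x^H a|^2 / (loading + |x|^2)) / loading.
    d is the element spacing in wavelengths."""
    M = len(x)
    xnorm2 = sum(abs(v) ** 2 for v in x)
    angles = [-90 + 180 * i / (n_grid - 1) for i in range(n_grid)]
    spectrum = []
    for th in angles:
        phase = 2 * math.pi * d * math.sin(math.radians(th))
        a = [cmath.exp(1j * phase * m) for m in range(M)]  # steering vector
        xa = sum(xv.conjugate() * av for xv, av in zip(x, a))
        denom = (M - abs(xa) ** 2 / (loading + xnorm2)) / loading
        spectrum.append(1.0 / denom)
    return angles, spectrum
```

Peaks of the returned spectrum over the angular grid give the DOA estimates; the loading constant controls the trade-off between robustness and resolution.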
SCPAT-GAN: Structural Constrained and Pathology Aware Convolutional Transformer-GAN for Virtual Histology Staining of Human Coronary OCT images
There is a significant need to generate virtual histological information from
coronary optical coherence tomography (OCT) images to better guide the
treatment of coronary artery disease. However, existing methods either
require a large pixel-wise paired training dataset or have limited capability
to map pathological regions. To address these issues, we propose a
structurally constrained, pathology-aware transformer generative adversarial
network, SCPAT-GAN, to generate virtual H&E-stained histology from OCT
images. SCPAT-GAN advances existing methods via a novel design that imposes
pathological guidance on structural layers using a transformer-based network.
Comment: 9 pages, 4 figures